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Abstract -Boolean networks serve as discrete models of regulation and signaling in biological 
cells. Identifying the key controllers of such processes is important for their understanding and 
planning further analysis. We quantify the dynamical impact of a node as the probability of 
damage spreading after switching the node's state. The leading eigenvector of the adjacency matrix 
is a good predictor of dynamical impact in case of long-term spreading. Quality of prediction is 
further improved when eigenvector centrality is based on the weighted matrix of activities rather 
than the unweighted adjacency matrix. Simulations are performed with random Boolean networks 
and a model of signaling in fibroblasts. The findings are supported by analytic arguments from a 
linear approximation of damage spreading. 



Introduction. — Boolean networks are coarse- 
grained models of the regulatory dynamics that controls 
the survival and proliferation of a living cell [l}|4]. The 
dynamics is time- and state-discrete. This Boolean ab- 
straction assumes that small differences in concentration 
levels are irrelevant. The binary distinction of a low or 
a high concentration of each bio-molecule is sufficient to 
capture the dynamics. 

A purely theoretical branch of studies is devoted to ran- 
domly constructed Boolean networks [5]|6] and strives to 
elucidate generic features of Boolean dynamics. From the 
perspective of statistical mechanics, averaged macroscopic 
quantities in the limit of large system size are described 
in dependence of ensemble parameters such as the proba- 
bility distribution of the employed Boolean functions [7||8] 
and the degree distributions of the networks [9j[To] . The 
number of attractors (ergodic subsets of the state space) 
[2 11 13 and the stability under perturbations [9 14 -20 



ference from data by a dedicated algorithm 29 



have been investigated. The underlying fundamental re- 
sult is a transition between convergent (stable) and diver- 
gent (unstable) dynamics when the input sensitivity of the 
Boolean functions passes a critical value [T4{[2T] . 

In recent years, the theory of random ensembles has 
been complemented by case studies showing that suit- 
ably constructed Boolean networks capture the behaviour 
of empirical regulatory systems |22 -26 . These system- 



specific Boolean networks are obtained by compiling bio- 



chemical interactions from the literature 27 , by discretiz- 
ing existing models of differential equations (28| , or by in- 



With the advent of system-specific Boolean models, new 
conceptual questions and analytical and numerical chal- 
lenges arise. In particular, the response of the system to 
external intervention may be quantified in a more detailed 
manner than an averaging over all eligible perturbations. 
Since each node now represents a specific biochemical en- 
tity, a node's individual impact on the dynamics is of in- 
terest. The prediction of nodes' impacts from the model 
may be compared to biological experiments. It is expected 
to trigger additional experiments and lead to improvement 
of models. 

The goal of this contribution is to establish a formal 
notion of node impact in Boolean dynamics and its rela- 
tion to a node's topological position in the network. We 
perform a linear approximation of the long-term effect of a 
perturbation at a specific node i. We find that, in good ap- 
proximation, the expected impact is monotonically related 
to the entry of i in the leading eigenvector of the adjacency 
matrix. When not only the network structure but also the 
Boolean functions are known, the estimate is improved by 
replacing the adjacency matrix with a weighted matrix of 
the activity values derived from the functions. The ana- 
lytic approximations are validated by numerical studies of 
random Boolean networks and an empirical network from 
the literature. 

Boolean networks. — A Boolean network is a state- 
and time-discrete dynamical system. The dynamics is de- 
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Fig. 1: Probabilistic description of damage spreading in a Boolean network. The estimated damage probability Pi{t) for a node 
i at time t is indicated by the height of the shaded area. At time t = 0, the upper node is perturbed, thus having a damage 
probability 1. Neglecting correlations, the probability that a damage spreads from a node i to a node j is the activity Oij as a 
label on each connection i — >■ j. Note that the case of more than one perturbed input, such as for the node with the Nor-function, 
is not captured by the activities. In the analytic treatment, we assume linear superposition of damage probabilities. The node 
performing Nor has an estimated damage probability (l/2)(l/2) + 1(1/2) — 3/4 at time t — 2. 



fined by an iteration 



x{t + l)=fix{t)) 



(1) 



with N Boolean dynamical variables written as a binary 
vector x{t) e {0, 1}^ at each time teNU {0}. The map- 
ping / : {0,1}^ — ^ {0,1}''^ is typically sparse: calculating 
the state Xj{t+1) requires knowledge of the state Xi{t) for 
a few (<C N) indices i at the previous time step. When 
the system is pictured as a directed network, the nodes 
{1, 2, . . . , N} carry the dynamic variables Xi, X2, ■ ■ ■ , xpf 
interacting along a relatively small number of directed 
arcs. Subindices address components of a vector such that 
Xj is the Boolean state of node j and fj is its Boolean 
function. 

In order to formalize and quantify these ideas, we con- 
sider the Xi-dependence of / as the mapping 



1 if/,(x)^/,(a;^0 
otherwise 



(2) 



This is the Boolean analogue of the usual partial derivative 
of a function, using a;^* to denote state vector x with its i- 
th entry negated. Note that 9'^*^/ also maps from {0, 1}^ 
to {0,1}''^. By averaging 9'-*'/ over all states with equal 
weight, the activity of i on j is obtained as 



(3) 



xe{a,i}'' 



The activity aij{f) is the probability that a perturbation 
(negation of state) at node i causes a perturbation at node 
j in the subsequent time step, assuming that all 2^ state 
vectors occur with equal probability. The sensitivity of the 
Boolean function fi is the sum of its incoming activities, 



N 



(4) 



Likewise, we define the strength of fi as the sum of outgo- 
ing activities 



N 



(5) 



The directed network on the nodes {1, 2, . . . , A''} ob- 
tained from / contains an arc from node i to node j if 
and only if aij{f) =/= 0. The adjacency matrix A of the 
network has an entry = 1 if aij{f) ^ and a^- — 
otherwise. 

Dynamical impact. — So far we have considered the 
average effect of a flip perturbation at the input i of a 
Boolean function fj on the output. Now we ask about the 
long-term behaviour of the whole system after a pertur- 
bation. We define 



H,{t) = {x e {0,1}'' : fix) ^ f{x^^)} 



(6) 



as the set of initial conditions such that a perturbation at 
node i spreads at least until time t. Then the fraction of 
such combinations 



mt)\ 

ON 



(7) 



out of all possible ones is the probability that the damage 
spreads for at least t steps after perturbing node i. We call 
hi{t) the dynamical impact oi node i for t steps. Dynamical 
impact strongly varies across nodes of a given network. 
For instance, the ratio between the largest and the average 
impact is 20 ± 4 at i = 100 on random Boolean networks 
with parameters N — 500, K — 2 and (s) — 1. 

Let us find an analytic approximation for hi{t) at long 
times t. By Pi{t) we denote the probability that node 
i carries a damage at time t, i.e. the probability that 
[f*'ix)]i ^ [/*(x^*)]i. After the perturbation has spread 
for at least one time step, the damages and also the un- 
perturbed states are correlated across nodes in general. 
Then the single- node probabilities pt (t) are insufficient for 
an exact description of the spreading probabilities. Here 
we make an approximation by neglecting the correlations. 
Then the damage probabilities follow the equation 



N 



Pi 



(t) (X ^a.yp,(t- 1) . 



(8) 



This equation is exact if the network, seen downstream 
from the initially perturbed node, is a directed tree. Then 
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Table 1: Centrality measures considered as predictors for the 
dynamical impact hi{t). 



I range I 


adjacency matrix A 


activity matrix H 


local 


out- degree (di) 


strength (cr,;) 


global 


eigenvector (e^) 


eigenvector (e^) 



at most one term in the summation is non-zero. Otherwise 
Eq. ([s]) serves as an approximation assuming a roughly 
linear accumulation of the damage. Figure [T] provides an 
illustration. In a more compact notation, Eq. (|8| reads 

p{t) = - 1) (9) 

using the transpose of the activity matrix H = {aij)ij. 
Iteration from the initial condition yields 

Pit) = (HT)V(O) . (10) 

In the limit of large t, the projections on the (left and 
right) eigenspaces of the leading eigenvector of H domi- 
nate the behaviour of p. Assuming that H is irreducible, 
non-negativity ensures that these eigenspaces are one- 
dimensional by the Perron-Frobenius theorem. Then we 
find unique normalized right and left principal eigenvec- 
tors e' and e of H with non-negative entries. In this ap- 
proximation by the dominant eigenspaces, the evolution 
of p reads 

p{t)^X\e'®e)p{0) (11) 

with the dyadic product of e and e' and the largest eigen- 
value A. According to Equation ( |Tl| ), the projection of 
the initial damage probability p(0) on the eigenvector e 
is what determines the expected damage at long time t. 
In other words, is indicative of the long-term damage 
expected from a perturbation at node i in the linearized 
treatment with suppression of correlations. 

To which extent does this asymptotically expected dam- 
age amplitude Ci inform us about the probability hi (t) that 
the perturbation spreads for a long time tl In the following 
sections we investigate the question by simulations. Often 
the network structure is known but information on the 
Boolean functions lacking. Taking all non-zero activities 
as having value 1 turns the activity matrix into the ad- 
jacency matrix. Therefore we also consider the predictive 
power of the leading left eigenvector e = (ei , 62, . . . , ejv) of 
the adjacency matrix. In situations without global knowl- 
edge on the system, we may want to compare dynamical 
impacts of a few nodes, for which the network neighbour- 
hood is known. Then we can resort to the strength cji or 
the out-degree di as centralities of node i based on local in- 
formation. Table [T] summarizes the four node centralities 
under consideration. 

Results for random networks. — Let us investi- 
gate the dynamical impact of nodes and its prediction 
by centrality measures (cf. Table [I]) on random Boolean 



networks with N — 500 nodes and connectivity parame- 
ter K = 2. See Methods for details. As shown in Fig- 
ure [2|^a), the long-term impact of perturbations is best 
predicted by the leading eigenvector e of the activity ma- 
trix in the whole range of sensitivity. Prediction by the 
leading eigenvector e of the adjacency matrix is inferior 
to that by e in the supercritical regime (s) > 1. When 
reaching (s) — K — 2, predictive powers become equal 
again, because all Boolean functions are exclusive-Or or 
its negation. Then all network connections have activity 
value 1 and adjacency and activity matrices are the same. 
The superiority of the eigenvector e as a predictor is in 
agreement with the analytic arguments given in the previ- 
ous section. Slightly above the critical sensitivity value 1, 
predictive power shows a peak for all centrality measures 
considered. Further analyses of the dynamics are neces- 
sary to understand the variation of the predictive power 
with average sensitivity, especially the minimum of Ve at 
(s) « 0.7. 

Figure [2](b) displays predictive power at short times, 
here t = 1. As expected, strength a is the best predictor in 
this case. Predictions by the out-degree vector d perform 
second best but significantly worse than those by strength 
a. 

The results in the upper panels of Fig. [2] are obtained 
under synchronous update of the whole system, as de- 
fined by Eq. ([!]). In order to check the robustness of the 
results, we repeat the simulations under stochastic asyn- 
chronous update according to Equation ( [12^ . The results, 
shown in panels (c) and (d) of Fig. [2] are qualitatively 
similar to those obtained under synchronous update. In 
the supercritical regime, however, the predictive power of 
all four centrality measures is increased when the updat- 
ing is asynchronous instead of synchronous. Thus damage 
spreading is easier to predict under asynchronous update, 
at least with the four centrality measures studied here. 
This effect must be rooted in the interplay between the 
update order and the network structure. For instance, the 
damage definitely heals when the perturbed node receives 
the first update before all its predecessors. The frequency 
of this happening decreases with the out-degree di and in- 
curs an additional dependence of dynamical impact on the 
centrality measure d. 

Simulations at different network sizes (N = 50, N — 
100, not displayed) yield similar results for all four combi- 
nations of long- or short-term spreading and synchronous 
or asynchronous updates. The predictive power of all four 
centrality measures remains constant or increases with sys- 
tem size. Furthermore, we investigate networks with pos- 
itive feedback only, i.e. without negation of signals (see 
Methods). For N = 500, long-term prediction {t = N) 
and synchronous update, the predictive power Vc of the 
eigenvector is on average 0.80±0.18; that of strength (Va) 
is on average 0.63 ± 0.14. 

Switching between attractors. — The long-term 
behaviour of Boolean dynamics is determined by attrac- 
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Fig. 2: Quality of prediction of dynamical impact in random Boolean networks at varying average sensitivity (s). Symbols 
distinguish the centrality measures out-degree d (green o), strength a (blue A), and the principal eigenvectors e and e of the 
activity matrix (red □) and the adjacency matrix (black o). The four panels represent combinations of long- or short-term 
prediction with deterministic synchronous or stochastic asynchronous update. Each data point gives the rank order correlation 
(cf. Methods) with dynamical impact h{t), averaged over 100 independent realizations of random Boolean network with given 
sensitivity (s), K = 2, and A'^ = 500 nodes. The error bars indicate the standard deviation over realizations. 
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Fig. 3: Power of centrality measures for predicting if a per- 
turbation changes the attractor reached. Symbols for the cen- 
trality measures are the same as in Fig. [2] Each data point is 
an average over 100 random Boolean networks with = 50 
nodes and connectivity parameter K = 2. Error bars indicate 
the standard deviation over the random network ensemble. Er- 
ror bars are scaled down by the factor 0.2 to avoid overlapping. 
Additional points (-1— symbols with dashed lines) are the frac- 
tion of realizations for which the principal eigenvector e gives 
the best prediction of all four measures. 



tors. These are minimal ergodic sets in state space. Un- 
der synchronous update, an attractor of length I is a se- 
quence of states a:(0), a;(l), . . . , x{l—l) such that f{x{t)) = 
x{[t + 1] mod /) for all t € {0, . . . ,1 - 1}. 



It is natural to ask if a perturbation in the initial condi- 
tion will cause the system to arrive at a different attractor. 
For this investigation, we define the attractor impact h'^ of 
node i as the fraction of initial conditions where a pertur- 
bation at node i changes the attractor eventually reached. 
At difference with dynamical impact hi{t), attractor im- 
pact h[ does not set an explicit time t after which to de- 
termine the spreading or healing of the perturbation. On 
the other hand, h[ does count the perturbation as healed 
whenever the perturbed and unperturbed dynamics even- 
tually become equal up to a time lag. 

Figure [3] shows the predictive power of the centrality 
measures for attractor impact of nodes. For averaged val- 
ues, the performance comparison yields Ve > Ve > Va > 
Vd, being the same ordering as for predicting long-term 
dynamical impact. Due to fluctuations around the aver- 
ages, this ordering does not hold for each single realiza- 
tion. At each considered value of the average sensitivity, 
a fraction at least 3/4 of the realizations has e as the best 
predictor. Subcritical networks, (s) < 1, are disregarded 
because here most realizations do not have more than one 
attractor. For (s) > 1.7, attractor search exceeds available 
computer time. 

Dynamical impact in a real networlt. — Let us 

test the performance of predictors on a non-random net- 
work now. Hclikar et al. describe signal transduction in 
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Table 2: Predictive power of centrality measures for the fi- 
broblast signal transduction dynamics. The upper part of the 
table considers the original system. The lower part is for the 
system after removal of the nine nodes providing constant in- 
put. Each line of the table is a scenario defined by the update 
mode and the choice of long- or short-term dynamics. The bold 
number indicates the maximum in each line. 





all nodes 






V. 


Vd 


sync 


t = 1 


0.671 


0.454 


0.930 


0.455 


t = 100 


0.920 


0.734 


0.746 


0.523 


async 


t = N 


0.706 


0.528 


0.904 


0.564 


t = lOOiV 


0.854 


0.694 


0.748 


0.542 




only core nodes 






V, 




sync 


t = 1 


0.633 


0.467 


0.946 


0.528 


t = 100 


0.911 


0.777 


0.738 


0.611 


async 


t = N 


0.658 


0.543 


0.919 


0.656 


t = WON 


0.834 


0.731 


0.741 


0.631 



fibroblasts with a detailed Boolean network 25 30 . The 
network has N = 139 nodes and 548 connections, in- 
cluding 59 self-couplings. We choose this network be- 
cause of its size and because of its large number of inter- 
twined feedback loops of various lengths I, see also Figure 
1 in [30|. We quantify the abundance of feedback by the 
trace of the l-th power of the adjacency matrix A, find- 
ing tr(A) = 59, tr(A2)/2 = 568, tT{A^)/3 = 82455 and 
tr(A'*)/4 = 13921796. The nodes fall into two classes. 
There are 9 input nodes with a self-coupling. Each of 
these applies the identity function to its own state, not re- 
ceiving signals from any other node. These nodes provide 
constant but choosable input to the rest of the network. 
Each of the remaining 130 nodes receives an input from 
at least one other node in this set. We call these the core 
nodes. The in-degree of nodes varies from 1 to 14, the 
out-degree varies from 1 to 28. 

In table |2] we summarize the predictive power of cen- 
trality measures for dynamical impact of nodes in the fi- 
broblast network. Also for this network, the leading eigen- 
vector e of the activity matrix is the best predictor of a 
node's ability to cause long-term spreading of a perturba- 
tion. Short-term spreading is best predicted by a node's 
strength ai . Table [3] shows the five nodes with the largest 
dynamical impact and their ranks with respect to the cen- 
trality measures. Prediction of these ranks by the central- 
ity measures is not perfect. However, the leading eigen- 
vector e of the activity matrix correctly identifies four out 
of the five nodes with the largest impact. 

Table |3] and the lower part of Table [2] are obtained 
for the fibroblast network after removal of the nine in- 
put nodes. These nodes indefinitely sustain their state. 
Therefore a perturbation at an input node i never heals, 
yielding maximal dynamical impact hi (<) = 1 for all times 
t. Reduction to the dynamical core by the removal of the 
input nodes allows for a less biased assessment of predic- 



Table 3: The five core nodes of the fibroblast network with the 
largest dynamical impact and their ranks with respect to the 
four centrality measures. Synchronous update is performed on 
the core of the network, after removal of the nine input nodes. 
Dynamical impact hi{t) measures spreading over t = 100 time 
steps. 



node i 


h^{100) 


n(e) 


n{e) 


n{cr) 


n{d) 


Src 


0.7707 


1 


1 


1 


1 


B-Arrestin 


0.7061 


4 


4 


9 


14 


GRK 


0.6458 


16 


27 


17 


43 


PIP2-45 


0.5961 


2 


12 


4 


4 


PKC 


0.5910 


3 


13 


3 


5 



five power. 

Discussion. — Even in random Boolean networks, 
nodes exhibit significant differences in dynamical impact. 
These differences are captured well by local and global 
centrality measures. From a linearization of the dynam- 
ics, these centralities arise as column sums and eigenvec- 
tors of a network matrix. Practical applications therefore 
benefit from efficient computation as compared to costly 
direct simulation of damage spreading. For the fibroblast 
network, a modern workstation calculates the principal 
eigenvectors e and e in < 10"^ s, to be compared to sev- 
eral minutes for sampling over perturbations. Detection 
of attractor switching with N = 500 nodes takes time on 
the order of hours, even days for some of the instances. 

An important implication for networked systems is the 
possibility of capturing response to perturbations based 
on incomplete information about the system's structure. 
Here, prediction of dynamical impact only uses the activ- 
ity matrix while being ignorant of the actual rule tables. 
No distinction between positive and negative feedback en- 
ters the calculation. Comparing the case of randomly 
mixed feedback types to that of only one type, we find 
that a combination of feedback types is neither necessary 
nor detrimental for prediction of dynamical impact. 

The scenario of predicting node impact based on partial 
knowledge is particularly relevant for biological systems 
where not all interactions are known in full detail. A large 
number of measures has been suggested for identifying the 
dynamical centers of biological systems based on the un- 
derlying network structure alone 



31 ■ 33 . Most of these 



approaches provide only an intuitive understanding of the 
assumed correlation between the centrality in the network 
and impact on the dynamics. The present framework, be- 
side its accuracy and computational efficiency, is based on 
a verifiable description of the system's response to per- 
turbations. In particular, it allows to distinguish between 
short- and long-term effects. The result is a detailed set of 
predictions testable in experiments. In how far the predic- 
tions meet experimental outcomes depends on the validity 
of idealizations at two levels: (i) the approximate analysis 
of dynamical impact in this letter; (ii) the Boolean ideal- 
ization to capture the real system's dynamics. 
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Methods. — A random instance of a Boolean network 
with N nodes, connectivity parameter K and expected av- 
erage sensitivity (s) is generated as follows. Each node i 
is assigned a Boolean function fi, drawn from the distri- 
bution 7r(/) cx exp[As(/)]. This distribution is normalized 
and supported by the set of 2^ Boolean functions with at 
most K inputs. Here A is chosen such that the expectation 
value of s{f) under the distribution tt is equal to the aver- 



age sensitivity (s) 20 . Then tt is the unique distribution 



maximizing entropy with the given (s). For each input, on 
which fi actually depends, a link (j, i) is established with 
the source node j drawn uniformly at random. When this 
would lead to a duplicate or self-coupling, j is discarded 
and redrawn. For the random networks with positive feed- 
back only, we use 7r{/) = 0.5 if / is the AND or the OR 
function, 7r(/) = otherwise. 

Both for the random and the empirical Boolean net- 
works, we estimate dynamical impact of a node i by 
10** runs of the dynamics. For each of these, a state 
a;(0) € {0, 1}''^ is drawn uniformly. Then two replica of 
the system are initialized with a;(0) and at (a;(0))-I-\ The 
fraction of runs where the replica are in different states 
at time t is taken as approximation of hi{t). When hi{t) 
is the same for all nodes i or the largest eigenvalue of 
the network's activity matrix is degenerate, the network 
is discarded and a new independent realization is drawn. 
Discarding of network happens mostly at small (s). It 
does not occur in any of the trials with (s) > 1.2 . 

The dynamics of Equation ([T]) is deterministic with syn- 
chronous update. Alternatively, we consider stochastic 
asynchronous update as follows. At each time step t, a 
node u(t) is drawn uniformly at random and the nodes 
take states 



Xi{t) 



li i = u 
otherwise 



(12) 



in the subsequent time step. The same random sequence 
u{t) of updated nodes is used for the perturbed and the 
unperturbed replica of the system. 

We quantify the predictive power Vy of a centrality mea- 
sure y S {d, e,cr, e} as the rank order correlation with 
dynamical impact Vy = corr(r(/i), r(j/)) using the usual 
Pearson correlation coefficient corr. For a general vector 
V = (ui, W2, . . . , Vn), the rank vector r{v) has entries 

n{v) = 1 + |{j ^ z\v, > v,}\ + ^ i\v, = v,}\ . (13) 
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